DOCUMENT RESUME

ED 348 392                                    TM 018 815

AUTHOR          Lee, Ong Kim; Wright, Benjamin D.
TITLE           Mathematics and Reading Test Equating.
INSTITUTION     Chicago Panel on Public School Policy and Finance,
                IL.; Chicago Public Schools, Ill.; Chicago Univ., IL.
                Center for School Improvement.
SPONS AGENCY    Spencer Foundation, Chicago, Ill.
PUB DATE        92
NOTE            18p.; Paper presented at the Annual Meeting of the
                American Educational Research Association (San
                Francisco, CA, April 20-24, 1992).
PUB TYPE        Reports - Evaluative/Feasibility (142) --
                Speeches/Conference Papers (150)
EDRS PRICE      MF01/PC01 Plus Postage.
DESCRIPTORS     Ability; Academic Achievement; Achievement Tests;
                Comparative Testing; *Educational Change; Elementary
                Secondary Education; *Equated Scores; *Mathematics
                Tests; *Reading Tests; *Scaling; Urban Schools
IDENTIFIERS     Chicago Public Schools IL; *Iowa Tests of Basic
                Skills; Logits; Rasch Model; Reform Efforts; *Test
                Equivalence



ABSTRACT 

As part of a larger project to assess changes in
student learning resulting from school reform, this study equates
levels 6 through 14 of the mathematics and reading comprehension
components of Form 7 of the Iowa Tests of Basic Skills (ITBS) with
levels 7 through 14 of the mathematics and reading comprehension
components of the CPS90 (another version of the ITBS), using a Rasch
analysis. The analysis results in the common calibration of all 1,031
mathematics items found in the 17 levels of the two test forms to
define a mathematics variable and all 602 reading items to define a
reading variable. Each item in each subject obtains a person-free
calibration (in logits) of its own level of difficulty on one common
scale linking all items of that subject. The 17 levels of the two
tests were successfully equated so that a person taking the CPS90 or
Form 7 (or a combination of items from the forms targeted at his or
her ability level) will obtain statistically equivalent measures of
ability. Logit measures give a more accurate picture of student rate
of growth than do grade equivalents, with rates of growth highest at
the lower grades and decreasing in the higher grades. Four tables, 13
figures, and 6 references are included. An appendix lists the
criterion definitions of variables. (SLD)



* Reproductions supplied by EDRS are the best that can be made *
* from the original document.                                  *




MATHEMATICS AND READING TEST EQUATING



Ong Kim Lee
University of Chicago



Benjamin D. Wright
University of Chicago



Paper presented at the annual meeting of the
American Educational Research Association
April 1992, San Francisco, CA.



Reading and Mathematics Equating Study¹

Introduction



This study is part of a larger project intended to assess school change in
student learning as a result of school reform. To do this we want to
look at improvements in students' academic achievement over time. Current policy
of the Chicago Public Schools is to change the form of the ITBS each year.
Enough anomalies appeared after the first change in form to
prompt questions about the adequacy of equating by grade equivalents, at least as
applied to Chicago schools. Unless these test forms are equated, it is not
possible to compare student performances from year to year to determine school
change. The Easton, Dean, and Bryk paper (1991) points out that earlier studies
(Frank and Seltzer, 1990) using longitudinal data bases had shown the inadequacy
of the grade equivalent scores for determining growth. Schulz, Shen and Wright
(1990) point out that the construction of the grade equivalent metric is such
that students show an average annual gain of one grade equivalent irrespective
of their actual changes in ability. The incorporation of time into grade
equivalents removes the possibility of determining growth rates.

This study equates levels 6 through 14 of the Mathematics and Reading
Comprehension components of the Iowa Tests of Basic Skills (ITBS Form 7) with
levels 7 through 14 of the Mathematics and Reading Comprehension components of
the CPS90 (another version of the ITBS), using Rasch analysis (Wright & Douglas,
1975; Wright, 1977; Wright & Stone, 1979). The analysis results in the
common calibration of all 1031 mathematics items found in the 17 levels of the
two test forms to define a math variable, and all 602 reading items to define a
reading variable. Each item in each subject obtains a person-free calibration
(in logits) of its own level of difficulty on the one common scale linking all
items of that subject.



¹ This project is a collaboration between the Center for School Improvement under the directorship
of Professor Anthony S. Bryk at the University of Chicago, The Chicago Panel on Public School Policy and Finance
under the directorship of John Q. Easton, and the Chicago Public Schools, represented by Carole Perlman, and
is supported by a grant from the Spencer Foundation to The Chicago Panel on Public School Policy and Finance.

We owe special thanks to Professor Anthony S. Bryk for his useful pointers in the course of the analysis
and for his input and comments on the draft of this paper. We would also like to thank Paul Dean, John Q.
Easton, Kenneth Frank, David Kerbow, Julia B. Smith and Arie van der Ploeg for their ideas and comments.



Design and Method



Test linking in this study was done with common persons and common items. The
design is in Figure 1. Each arrow represents a group of persons taking a pair
of tests. The initial design took into consideration the need [...] number of
students [...] respective forms. Level 14 of Form 7 shares 67% of its items
with Level 13. Linking was strengthened by adding existing data for Levels 10
and 12 of both Forms and Level 14 of Form 7. These data are from the regular
student testing, from schools used in the study. Table 1 lists the number of
items and number of students used in the analysis, for each of the test levels.



Figure 1. Equating study design.

Table 1. Number of items and persons used in the analysis, by test level.
Values are transcribed from the source scan; "?" marks a character or cell
that is illegible there.

                   Mathematics                        Reading
               Form 7          CPS90            Form 7          CPS90
Test         Items Persons   Items Persons    Items Persons   Items Persons
1. Level 6     3?    245       -      -         79    218       -      -
2. Level 7     81    459       82    ???        66    466       56    ???
3. Level 8     88    502       96    ?50        67    566       61    544
4. Level 9     90    282       86    ?80        44    299       44    453
5. Level 10    99    196       95    205        49    227       49    2?6
6. Level 11   109    156      101    157        54    177       ?4    209
7. Level 12   114    196      109    198         ?      ?       56    2??
8. Level 13   117    ??0      113    178        57    ?29       ?7    151
9. Level 14   121    288      117    191        58    2?9       58    179

The Calibration Matrices

The data were cleaned in four stages: (1) Only response strings marked valid,
as defined by standard Chicago Public Schools' procedures, were included;
(2) Response strings showing series of zeroes and/or same responses for 25% or
more of the total number of items were dropped; (3) Misfitting persons on Rasch
estimates were removed; and (4) Persons with large standardized differences in
performances on their pair of tests were removed. About 12% of data were lost
through cleaning. After data cleaning, the item response strings were linked
into one giant calibration matrix such that strings for a person taking two
tests are aligned into the same row and responses to a given item fall into the
same column. This is diagrammed in Figure 2.

Tests are arranged from the lowest test levels of Form 7 and CPS90 to the
highest. This results in a Mathematics calibration matrix with 1031 different
items taken by 2995 different persons, and a Reading calibration matrix with 602
different items taken by 3159 persons.

Notice that these calibration matrices are only about 15 percent filled with data.
Nevertheless, reliable equating was accomplished from Grade 1 through Grade 8.
Rasch equating does not need complete data to calibrate items successfully onto
a common scale or to obtain good estimates of person measures.
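
The linking of response strings into a single sparse matrix can be sketched as follows. This is only an illustration under stated assumptions, not the procedure used in the study: the function names, the record layout, and the toy data are all hypothetical. It shows how responses from two tests taken by the same person land in one row, how a common item lands in one column, and how the fill percentage is obtained.

```python
import numpy as np

def build_calibration_matrix(records, item_ids):
    """Stack per-test response strings into one persons-by-items matrix.

    records: list of (person_id, {item_id: 0 or 1}) pairs, one per test taken.
    item_ids: ordered list of every item in the bank (defines the columns).
    Cells a person never attempted stay NaN (missing, not wrong).
    """
    col = {item: j for j, item in enumerate(item_ids)}
    persons = []
    row = {}
    for person_id, _ in records:
        if person_id not in row:
            row[person_id] = len(persons)
            persons.append(person_id)
    matrix = np.full((len(persons), len(item_ids)), np.nan)
    for person_id, responses in records:
        for item, score in responses.items():
            matrix[row[person_id], col[item]] = score
    return persons, matrix

def fill_fraction(matrix):
    """Share of cells that hold an observed response."""
    return float(np.mean(~np.isnan(matrix)))

# Hypothetical toy data: person "p1" took two linked tests, "p2" took one.
records = [
    ("p1", {"a1": 1, "a2": 0}),
    ("p1", {"b1": 1, "a2": 0}),  # "a2" is a common item shared by both tests
    ("p2", {"b1": 0, "b2": 1}),
]
items = ["a1", "a2", "b1", "b2"]
persons, matrix = build_calibration_matrix(records, items)
print(persons, fill_fraction(matrix))
```

Keeping unattempted cells as missing rather than scoring them as wrong is what lets a mostly empty matrix still be calibrated.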



Test strings were flagged when they failed evaluation under one or more of the following criteria:
(1) More than 3 multiples; (2) 50-70% [illegible] and > 1 embedded omits; (3) [illegible] and > 0 embedded omits.

We would like to thank the Chicago Public Schools for doing the first stage of cleaning by flagging
invalid response strings.



Each matrix was Rasch-analyzed in a one-step equating procedure and all tests
were placed on a common logit scale. Item calibrations are in difficulty logits,
the log odds of an item provoking failure from a person with ability equal to
the scale zero. We now have a bank of 1031 Mathematics items and another bank
of 602 Reading items. Fit statistics do not suggest the existence of dimensions
other than Mathematics and Reading in these two tests.
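
The meaning of a difficulty logit can be checked numerically. The sketch below assumes the standard dichotomous Rasch model, in which the probability of success depends only on the difference between person ability and item difficulty; the function names are illustrative and not taken from the paper.

```python
import math

def rasch_p_success(ability, difficulty):
    """Probability of a correct response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def log_odds_of_failure(ability, difficulty):
    """Log odds that the item provokes a failure from this person."""
    p = rasch_p_success(ability, difficulty)
    return math.log((1.0 - p) / p)

# A person at the scale zero meeting an item calibrated at 1.5 logits:
# the log odds of failure equal the item's difficulty calibration.
print(round(log_odds_of_failure(0.0, 1.5), 6))
```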



Figure 2. The Mathematics and Reading calibration matrices (persons by items;
the two panels are labeled as roughly 14% and 16% filled).



Determining Person Measures Using the Mathematics and Reading Banks



(a) When response strings are available



Persons responding to any of the ITBS test levels equated here can have their
abilities estimated from their responses by running a Rasch analysis on their
responses while anchoring the item difficulties on their bank values. Any set
of items can be selected from these banks to form a test targeted on a given
group of persons, and person abilities estimated in the same way. A realistic
standard error for each measure can be estimated by inflating it for observed
person misfit. This is because Rasch estimates are based on perfect fit and the
standard errors for misfitting persons tend to be underestimated.
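
A minimal sketch of anchored ability estimation follows. It assumes the dichotomous Rasch model and a Newton-Raphson maximum-likelihood search with the item difficulties held fixed at their bank values. The paper does not give its inflation formula, so the rule in `inflated_se` (multiply the model standard error by the square root of the person's fit mean-square when that exceeds 1) is an assumption, as are all function names.

```python
import math

def estimate_ability(responses, difficulties, tol=1e-6, max_iter=50):
    """Maximum-likelihood ability (logits) with item difficulties held fixed.

    responses: list of 0/1 scores; difficulties: matching anchored bank values.
    Returns (ability, model_standard_error).
    """
    raw = sum(responses)
    n = len(responses)
    if raw == 0 or raw == n:
        raise ValueError("zero or perfect scores have no finite ML estimate")
    theta = math.log(raw / (n - raw))  # starting value from the raw score
    for _ in range(max_iter):
        probs = [1.0 / (1.0 + math.exp(-(theta - d))) for d in difficulties]
        expected = sum(probs)                      # expected raw score
        info = sum(p * (1.0 - p) for p in probs)   # test information
        step = (raw - expected) / info
        theta += step
        if abs(step) < tol:
            break
    probs = [1.0 / (1.0 + math.exp(-(theta - d))) for d in difficulties]
    info = sum(p * (1.0 - p) for p in probs)
    return theta, 1.0 / math.sqrt(info)

def inflated_se(se, fit_mean_square):
    """Assumed rule: widen the model SE when the person misfits (mean-square > 1)."""
    return se * math.sqrt(max(1.0, fit_mean_square))

# Hypothetical anchored difficulties and one person's responses.
theta, se = estimate_ability([1, 1, 0, 1], [-0.5, 0.2, 1.0, 0.4])
print(round(theta, 3), round(se, 3), round(inflated_se(se, 1.44), 3))
```

Because the difficulties are never re-estimated, any subset of bank items yields measures on the same common scale.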

(b) When response strings are not available

In longitudinal studies where tests were implemented years ago, response strings
are no longer available. The student measures therefore cannot be determined
from an analysis of their responses. An indirect method based on their recorded
grade-equivalents (GE's) must be used. The method is to regress the direct
person measures for each test level from the equating study on their GE's for
that test level. The person measures used were those of the individual test
analyses of uncleaned data, with item difficulties for this step anchored on
their bank values. The regression coefficients can then be used to predict
student ability measures from the GE's they obtained in their earlier tests.

Standard errors for these measures must also be estimated. Again regression
analysis was used. This time the dependent variables were the standard errors
(inflated for misfit) of the measures from the direct analyses of uncleaned data.
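
The indirect method can be sketched as a per-level regression. The example assumes a simple straight-line fit, which the text does not state explicitly, and uses made-up GE and measure values for a single test level. The same fit would be repeated with the inflated standard errors as the dependent variable.

```python
import numpy as np

def fit_ge_to_measure(ge, measures):
    """Least-squares line predicting logit measure from grade equivalent."""
    slope, intercept = np.polyfit(ge, measures, 1)
    return float(slope), float(intercept)

def predict_measure(ge, slope, intercept):
    """Apply the fitted line to recorded GE's from an earlier test."""
    return slope * np.asarray(ge, dtype=float) + intercept

# Hypothetical values for one test level (constructed to lie on a line).
ge = np.array([2.0, 2.5, 3.0, 3.5, 4.0])
measures = np.array([-1.0, -0.6, -0.2, 0.2, 0.6])
slope, intercept = fit_ge_to_measure(ge, measures)
print(round(slope, 3), round(intercept, 3))
print(predict_measure([3.2], slope, intercept))
```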

Mean Item Difficulty of Form 7 and CPS90

Tables 2(a) and 2(b) show the mean item difficulty for each test level. The last
columns of Tables 2(a) and 2(b) show the differences between the mean logit
measures of CPS90 and Form 7. It is clear that CPS90 is slightly harder
than Form 7 at most test levels. Mean test difficulties were plotted against
grade and are shown in Figures 3 and 4 for Reading and Mathematics respectively.
The difference in difficulties between CPS90 and Form 7 [...] Level 9 (Grade 3)
[...]. To see these differences more clearly, they were plotted against grade
and shown in Figures 5 and 6.

Table 2(a). Mean Item Difficulty for Form 7 and CPS90. Values are transcribed
from the source scan; "?" marks a character that is illegible there.

ITBS                  Form 7                          CPS90                 Logit
Test            No. of   Mean Item            No. of   Mean Item            Diff.
Level   Grade   Items    Difficulty   S.D.    Items    Difficulty   S.D.    (C9-F7)
  6       K       3?       -3.51      1.??      -          -         -         -
  7       1       81       -2.64      1.12      82       -2.82      1.17      0.02
  8       2       88       -1.77      1.38      ??       -1.48      1.38      0.29
  9       3       90       -1.??      1.82      88       -1.0?      1.29      0.01
 10       4       99       -0.09      1.11      95        0.07      1.20      0.18
 11       5      109        1.0?      0.88     101        0.98      1.??      0.09
 12       6      114        1.90      0.95     109        1.87      1.02     -0.03
 13       7      117        2.?1      0.97     113        2.13      0.95      0.22
 14       8      121        3.07      0.98     117        3.22      0.95      0.15

Notice from Tables 2(a) and 2(b) that the standard deviations of item
calibrations for Reading increase with grade, that is, the Reading items [...].
[...] Mathematics decrease with grade, [...] closer together in difficulty
level at the higher test levels. This requires further investigation as to why
it is so.



Figure 3. Plot of mean Reading item difficulties against grade for Form 7
(Levels 6 through 14) and CPS90 (Levels 7 through 14).

Figure 4. Plot of mean Mathematics item difficulties against grade for Form 7
(Levels 6 through 14) and CPS90 (Levels 7 through 14).

Figure 5. Plot of differences between Reading item difficulties of CPS90 and
Form 7 against grade.

Figure 6. Plot of differences between Mathematics item difficulties of CPS90
and Form 7 against grade.



Mean Measures and Mean Grade Equivalents of Common Persons taking Pairs of Tests

For the common persons taking a CPS90 and a Form 7 test at the same levels
(arrows 3, 7, 11, 12, and 13 in Figure 1), the mean measures and grade
equivalents were calculated. Results are shown in Tables 3 and 4 for Reading and
Mathematics.



Table 3. Mean Measures, Mean Grade Equivalents, and Standard Deviations of Common Persons
Between Form 7 and CPS90 Reading Tests. Values are transcribed from the source scan;
"?" marks a character that is illegible there, and "-" marks a level with no common persons.

ITBS                        Person Measures (logits)                    Grade Equivalents
Test          No. of      Form 7        CPS90      Logit Diff      Form 7        CPS90      GE Diff
Level  Grade  Persons   Mean   S.D.   Mean   S.D.   (C9-F7)      Mean   S.D.   Mean   S.D.   (C9-F7)
  6      K       -        -     -       -     -        -           -     -       -     -        -
  7      1      120     -1.80  1.27   -1.85  0.66    -0.0?        1.67  0.85    1.25  0.51    -0.42
  8      2      154     -1.08  0.86   -1.00  0.78     0.08        2.09  0.76    1.77  0.69    -0.32
  9      3      160      0.0?  1.01    0.00  0.97    -0.0?        3.65  1.05    2.95  0.99    -0.68
 10      4       -        -     -       -     -        -           -     -       -     -        -
 11      5      175      1.47  1.00    1.50  1.00     0.05        5.?1  1.40    5.15  1.38    -0.38
 12      6       -        -     -       -     -        -           -     -       -     -        -
 13      7      1?4      2.?6  0.83    2.49  0.80    -0.07        6.95  1.54    7.14  1.55     0.21
 14      8       -        -     -       -     -        -           -     -       -     -        -




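Table 4. Mean Measures, Mean Grade Equivalents, and Standard Deviations of Common Persons
Between Form 7 and CPS90 Mathematics Tests. Values are transcribed from the source scan;
"?" marks a character or cell that is illegible or evidently misread there, and "-" marks
a level with no common persons.

ITBS                        Person Measures (logits)                    Grade Equivalents
Test          No. of      Form 7        CPS90      Logit Diff      Form 7        CPS90      GE Diff
Level  Grade  Persons   Mean   S.D.   Mean   S.D.   (C9-F7)      Mean   S.D.   Mean   S.D.   (C9-F7)
  6      K       -        -     -       -     -        -           -     -       -     -        -
  7      1       90     -2.42  0.??   -2.40  0.78    -0.00        1.?4  0.?6    1.41  0.48    -0.2?
  8      2      ???     -1.06  0.?6   -1.01  0.94     0.01        ?.??  0.78    2.72  0.70    -0.??
  9      3      11?     -0.48  1.05   -0.3?  1.29     0.87         ?    0.84    3.41  0.?2    -0.41
 10      4       -        -     -       -     -        -           -     -       -     -        -
 11      5      1?0      1.64  0.81    1.?8  0.87     0.04        ?.17  0.84    5.72  0.91    -0.45
 12      6       -        -     -       -     -        -           -     -       -     -        -
 13      7      115      2.?3   ?      2.7?  0.?5     0.00        7.81  1.??    7.25  1.24    -0.54
 14      8       -        -     -       -     -        -           -     -       -     -        -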

Since the same persons took both tests, their matched mean measures on
the two tests should be statistically equivalent. This is shown graphically by
plotting the mean measures against grade in Figure 7 (for Reading). The same was
done for grade equivalents in Figure 8. Similar plots are shown for Mathematics
in Figures 9 and 10. Note that the matched mean GE's for the same persons are
not the same over the two test forms they took, for both Reading and
Mathematics. Students obtain higher grade equivalents from Form 7 for both
Reading and Mathematics, except for Grade 7 Reading. This shows a bias in
grade-equivalent equating of the ITBS, that is, GE's produced by the two forms are not
directly comparable. The GE plots are not even the straight lines we expect from
GE scoring. For Grade 7 Reading, the mean Rasch logit measure shows that CPS90
is slightly harder than Form 7. In grade equivalents, however, the same students
appear to have done better on CPS90. This apparent contradiction suggests the
possibility that the norm group used for the CPS90 Grade 7 Reading could have
been a less able group compared to the norm group for Form 7. Hence the same
group of students in the equating study, when seen in terms of GE, appear to have
performed better on the CPS90 than on the Form 7 Grade 7 Reading. When compared
in logit measures for common persons, Form 7 and CPS90 differences for all the
grade levels are very close to zero, as expected. This is shown in Figures 11 and
12.
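
One simple way to check that common persons receive equivalent measures from two forms is to look at the mean paired difference relative to its standard error. The sketch below is an illustration with hypothetical measures, not the test used in the study; the function name is an assumption.

```python
import math

def paired_difference(measures_a, measures_b):
    """Mean of (b - a) over common persons, with its standard error."""
    diffs = [b - a for a, b in zip(measures_a, measures_b)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    return mean, math.sqrt(var / n)

# Hypothetical logit measures for five persons who took both forms.
form7 = [-1.2, -0.4, 0.3, 1.1, 0.0]
cps90 = [-1.0, -0.5, 0.4, 1.0, 0.1]
mean_diff, se = paired_difference(form7, cps90)
print(round(mean_diff, 3), round(se, 3))
```

A mean difference that is small relative to its standard error is consistent with the two forms measuring on the same scale.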



Figure 7. Plot of mean Reading measures for the same person groups, for CPS90
and Form 7, against grade.

Figure 8. Plot of mean Reading GE's of the same person groups, for CPS90 and
Form 7, against grade.

Figure 9. Plot of mean Mathematics measures for the same person groups, for
CPS90 and Form 7, against grade.

Figure 10. Plot of mean Mathematics GE's of the same person groups, for CPS90
and Form 7, against grade.



Figure 11. Plots of differences (C9-F7) between CPS90 and Form 7 in (a) mean
measures and (b) mean GE's against grade, for Reading.

Figure 12. Plots of differences (C9-F7) between CPS90 and Form 7 in (a) mean
measures and (b) mean GE's against grade, for Mathematics.



The differences in mean GE's, adjusted to the logit scale using the average
exchange of 0.8 logits per grade so that the vertical scales are all comparable,
were also plotted against grade in Figures 11 and 12. Here the differences in
GE's between the two test forms are much larger than zero.
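
The rescaling of GE differences is a single multiplication. As a small worked example, the sketch below applies the 0.8 logits-per-grade exchange rate quoted above to the Grade 1 Reading GE difference of -0.42 from Table 3; the function and constant names are illustrative.

```python
LOGITS_PER_GRADE = 0.8  # average exchange rate quoted in the text

def ge_diff_to_logits(ge_diff, rate=LOGITS_PER_GRADE):
    """Express a difference in grade equivalents on the logit scale."""
    return ge_diff * rate

# Grade 1 Reading: GE difference (C9-F7) of -0.42 as read from Table 3.
print(round(ge_diff_to_logits(-0.42), 3))
```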






Standard Deviations of Measures and Grade Equivalents

From Tables 3 and 4 we see that for Reading and Mathematics, the standard
deviations of GE's increase with grade while those of measures do not. The
spread of students in logit measures does not change much from grade to grade.
The increasing standard deviations of the grade equivalents give the misleading
impression that student spread increases, that they get further apart. Figures
13(a) and 13(b) plot standard deviation against grade. Note the relative
constancy of the logit standard deviations and the systematic increase of the
GE's standard deviations across the grades. The illusion of increasing spread
produced by GE standard deviations could easily be misunderstood to prove that
schooling increases the differences among students. The logit measure plots show
that this is clearly not so.

Figure 13(a). Plots of standard deviations of (i) measures and (ii) grade
equivalents against grade for Mathematics.

Figure 13(b). Plots of standard deviations of (i) measures and (ii) grade
equivalents against grade for Reading.

Criterion Definition of Variable

Appendix A is an example of a criterion definition of the variable called
Mathematics Computation. D on the vertical axes is a linear transformation of
item calibrations, D = 26 + 5 * (item difficulty). The vertical axis on the right
shows the locations of the mean student ability at each grade.
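
The rescaling to D units can be illustrated directly. The sketch takes the origin as 26 and the slope as 5, which is how the partly legible formula in the source reads; both constants, and the function names, should be treated as assumptions.

```python
def to_d_units(item_difficulty_logits, origin=26.0, scale=5.0):
    """Linear rescaling of a logit calibration onto the D axis (assumed constants)."""
    return origin + scale * item_difficulty_logits

def from_d_units(d, origin=26.0, scale=5.0):
    """Inverse transformation, from D units back to logits."""
    return (d - origin) / scale

# An item calibrated at -2.0 logits and the D value 36 converted back.
print(to_d_units(-2.0), from_d_units(36.0))
```

Because person measures and item calibrations share one scale, the same transformation places a student's measure on the item map.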

Such item maps can readily be constructed once items have been calibrated, which
an item bank of this kind enables. The math items increase in complexity as the
difficulty level increases. This is useful to teachers. Students' measures are
directly comparable to item difficulty calibrations. Reference to an item map
such as this enables a teacher to determine what a student has or has not
mastered, to see where the student is in his mathematics education, and to plan
lessons accordingly.

Conclusion 

The 17 levels of the ITBS Mathematics and Reading tests used in this study have
been successfully equated and are each on a common scale of item difficulty from
K to 8. A person taking either CPS90 or Form 7 (or any combination of items from
these two test forms targeted at his ability level) will obtain statistically
equivalent measures of his ability.

In the grade-equivalent metric, the difficulty of the test depends on the ability
level of the norming sample. A student's grade-equivalent depends on which test
form he takes. As a result it is impossible to compute student abilities by
studying the grade equivalents. Students scoring lower grade-equivalents on a
given test may be thought to be less able, when the test may actually be harder
or the norming sample more able. Similarly, students scoring higher
grade-equivalents may not necessarily be of higher ability since the test form may in
fact be easier or the norming sample less able. Using grade-equivalents results
in misleading interpretations of student performance. These have serious policy
implications. Teachers may recommend remedial programs for students who do not
actually need them. Students may be thought to have acquired the desired level
of competency when they have not. Funds may be channelled to the wrong programs
for the wrong students.

Students' rates of growth will never be shown by grade equivalents. Every year
they are forced to have one unit of grade-equivalent higher. A plot of GE growth
against grade is forced close to a straight line, giving the false impression that
the rate of growth is uniform at all ages. With logit measures, however, rates
of growth are shown to be highest at the lower grades, and to decrease in the
higher grades.



References



Easton, J. Q., Dean, P., & Bryk, A. S. (1991). School change and its correlates:
Studying Chicago Public Elementary Schools after school reform. Paper
presented at the annual meeting of the American Educational Research
Association.

Frank, K. & Seltzer, M. (1990). Modeling growth in reading achievement. Paper
presented at the annual meeting of the American Educational Research
Association.

Schulz, E. M., Shen, L., & Wright, B. D. (1990). An equal-interval scale for
studying reading growth. Paper presented at the annual meeting of the
American Educational Research Association.

Wright, B. D. (1977). Misunderstanding the Rasch Model. Journal of Educational
Measurement, 14(3).

Wright, B. D. & Douglas, G. A. (1975). Better procedures for sample-free item
analysis (Research Memorandum No. 20). Chicago: MESA Psychometric Laboratory.

Wright, B. D. & Stone, M. H. (1979). Best test design: Rasch Measurement.
Chicago: MESA Press.



Appendix A

CRITERION DEFINITION OF VARIABLE

[Item map for Mathematics Computation: items are located by D on the left
vertical axis, and the mean student ability for each of Grades 1 through 8 is
marked on the right vertical axis. The item content is not legible in the
source.]